Add a check in non-debug mode for a T_NONE class by jhawthorn · Pull Request #573 · Shopify/ruby

jhawthorn · 2025-05-05T20:39:13Z

Although a crash could occur anywhere, one of the most common symptoms we see from getting a reference to a garbage collected object is crashing while attempting to call a method on it.

These crashes usually occur when trying to perform an rb_id_table_lookup in the "class" cc_tbl, where the class is usually another garbage collected object, because the freelist is stored in the class pointer. This of course isn't guaranteed to happen, as it only works for T_NONE.

This commit aims to have a better error message in this case, in hopes of having a better grouping of errors and to give a better hint to those investigating and triaging crashes.

Example output:

ptr = Fiddle.dlwrap(Object.new)
GC.start
Fiddle.dlunwrap(ptr).to_s

./test.rb:9: [BUG] attempted to search method 'to_s' on a garbage collected object
ruby 3.5.0dev (2025-05-05T16:29:43Z call_t_none_assert.. 7b419c752b) +PRISM [arm64-darwin24]

-- Crash Report log information --------------------------------------------

samuel-williams-shopify · 2025-05-06T11:26:26Z

Could we consider upstreaming this change? LGTM.

The "locale" encoding is only registered once, so we can use an atomic to avoid the full lock and hash lookup.

@b

* Reserve 2 bits for expressing object layout We would like to make instance variable reads in the JIT compiler faster (as well as simplify the JIT implementation). Currently, in order to read an instance variable, we have to: 1. Test for heap object 2. Load object to a 64 bit register 3. Mask the object header 4. Bit test against the masked header 5. JNE 6. Load field We would like to: 1. Test for heap object 2. Load object shape to a 32 bit register 3. Bit test against the shape 4. JNE 5. Load field The way we fetch instance variables is not consistent across objects. In order to realize our goal, we need to encode object layout inside the shape. If we encode object layout inside the shape, then the shape itself will guarantee that the access pattern generated by the JIT compiler is correct. We should encode the following load patterns into the shape tag bits. This way we can share shapes on transitions, but be able to differentiate the access patterns for the JIT compiler. In other words, two objects can have an `@a -> @b -> @c` transition and share the same shape, but the tag bits can differentiate the access pattern so that the JIT compiler can be confident that the machine code is correct. Here are the patterns: 1. Embedded/Extended T_OBJECT Instance Variables Objects with direct references to instance variables or via malloc buffer 2. Objects with fields_objects fields These are Data and TypedData objects. They have an associated axillary imemo/fields object that stores the instance variables. The access pattern is `object[2] + 2`. The fields object is the 3rd field, and the instance variables start at +2 inside the fields object. The fields object itself is a Ruby object, so it contains the usual header bits + class headers. 3. Non Boxable Classes / Modules This is similar to Objects with fields_objects, but the fields object is stored at a different offset. We’re differentiating this from boxable classes and modules because those are harder to support. 4. Other "Other" pattern is for objects that are rare, or have difficult-to-implement access patterns. This includes: * Boxable classes and modules * Structs (for now) * Objects that use the geniv table Proposed shape bit layout: ``` Current shape_id_t is 32 bits: 31 28 27 26 25 24 23 22 19 18 0 +-----------+--+--+--+--+--+------------+----------------------------+ | unused |L1|L0|OI|FR|CX| heap index | shape tree offset | +-----------+--+--+--+--+--+------------+----------------------------+ | | | | | | | | | | | | | +-- bits 0-18: SHAPE_ID_OFFSET_MASK | | | | | +--------------- bits 19-22: SHAPE_ID_HEAP_INDEX_MASK | | | | +------------------ bit 23: SHAPE_ID_FL_COMPLEX | | | +--------------------- bit 24: SHAPE_ID_FL_FROZEN | | +------------------------ bit 25: SHAPE_ID_FL_HAS_OBJECT_ID +--+--------------------------- bits 26-27: SHAPE_ID_LAYOUT_MASK ``` The important part about these layout patterns is that they do not reflect the _type_ of object, only how the object is laid out in memory. For example, we currently treat structs as "other", but we can refactor them to have the same layout as "Objects with fields_objects", and when we do that they should get a different bit in the shape header. This commit only reserves the two bits, it doesn't use them in the JIT compiler yet. Co-Authored-By: John Hawthorn <john@hawthorn.email> Co-Authored-By: Max Bernstein <tekknolagi@gmail.com> * Update gc.c Co-authored-by: Nobuyoshi Nakada <nobu.nakada@gmail.com> * Update shape.h Co-authored-by: Jean Boussier <jean.boussier@gmail.com> * fix function name * Update shape.c Co-authored-by: Jean Boussier <jean.boussier@gmail.com> * fix function name * Revert "Update shape.c" This reverts commit 900711d. * add comment --------- Co-authored-by: John Hawthorn <john@hawthorn.email> Co-authored-by: Max Bernstein <tekknolagi@gmail.com> Co-authored-by: Nobuyoshi Nakada <nobu.nakada@gmail.com> Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

This reverts commit 63d9f09.

Replace RSA keys for intermediate_key and ee_key with RSA 4096-bit keys rsa-1.pem and rsa-2.pem. At least RSA 2048-bit keys are required for signing and encryption in FIPS. SP 800-131A Rev. 2 * 3. Digital Signatures * 6. Key Agreement and Key Transport Using RSA https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar2.pdf https://github.com/openssl/openssl/blob/71943544885ff364a10bcc5ffc62d0e651c9a021/providers/common/securitycheck.c#L72-L73 ``` $ openssl rsa -in test/openssl/fixtures/pkey/rsa-1.pem -text -noout | head -1 Private-Key: (4096 bit, 2 primes) $ openssl rsa -in test/openssl/fixtures/pkey/rsa-2.pem -text -noout | head -1 Private-Key: (4096 bit, 2 primes) ``` ruby/openssl@f130312442

Feeding a deeply nested constructed encoding to OpenSSL::ASN1.decode, .decode_all, or .traverse can cause unbounded recursion and result in SystemStackError. Add an explicit nesting depth limit of 200 levels and raise OpenSSL::ASN1::ASN1Error if it is exceeded. This limit is arbitrary and currently not configurable, but should be sufficient for any practical use cases. Fixes https://hackerone.com/reports/3662125 ruby/openssl@fc753239cc

`"%" PRIsVALUE` just copies/appends the corresponding argument as a string.

@jhawthorn

Suggested by @jhawthorn.

To use tables mainly.

gc/mmtk/mmtk.c:505:1: warning: function 'rb_mmtk_gc_thread_panic_handler' could be declared with attribute 'noreturn' [-Wmissing-noreturn] 505 | { | ^ gc/mmtk/mmtk.c:987:27: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant] 987 | char obj_info_buf[info_size]; | ^~~~~~~~~ gc/mmtk/mmtk.c:990:34: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant] 990 | char parent_obj_info_buf[info_size]; | ^~~~~~~~~ ruby/mmtk@8ce21cdf10

Bumps the github-actions group with 1 update in the / directory: [taiki-e/install-action](https://github.com/taiki-e/install-action). Updates `taiki-e/install-action` from 2.79.11 to 2.81.1 - [Release notes](https://github.com/taiki-e/install-action/releases) - [Changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md) - [Commits](taiki-e/install-action@13608cb...e49978b) --- updated-dependencies: - dependency-name: taiki-e/install-action dependency-version: 2.81.1 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions ... Signed-off-by: dependabot[bot] <support@github.com>

In hash_replace_ref, it currently evaluates rb_gc_location twice for the key and value. We can reduce that to once each.

Document missing gemrc configuration keys. The documented configuration key order aligns with the following part. https://github.com/ruby/rubygems/blob/287175d4ae57340e22c6f1d21da1489d7dac35c5/lib/rubygems/config_file.rb#L240-L257 Add and update tests for missing gemrc configuration use_psych and gemhome key assertions. Fix test_initialize_global_gem_cache_gemrc and test_initialize_concurrent_downloads to test with symbolic keys, as global_gem_cache and use_psych keys are documented as a symbolic key. ruby/rubygems@8baf4d49a4

(ruby/strscan#205) See also: https://bugs.ruby-lang.org/issues/21943 This is semantically equivalent to `scanner[specifier]&.to_i(base)` but this is faster than `scanner[specifier]&.to_i(base)` because `integer_at` doesn't create a temporary String when possible. This PR also includes a benchmark for them: ```console $ ruby -v -S benchmark-driver benchmark/integer_at.yaml ruby 4.1.0dev (2026-05-01T19:25:51Z master ruby/strscan@f2845eab29) +PRISM [x86_64-linux] Warming up -------------------------------------- [].to_i 24.272M i/s - 25.109M times in 1.034481s (41.20ns/i, 32clocks/i) integer_at 61.188M i/s - 62.491M times in 1.021289s (16.34ns/i, 62clocks/i) Calculating ------------------------------------- [].to_i 26.831M i/s - 72.816M times in 2.713883s (37.27ns/i, 169clocks/i) integer_at 81.331M i/s - 183.564M times in 2.256998s (12.30ns/i, 43clocks/i) Comparison: integer_at: 81331225.5 i/s [].to_i: 26831046.3 i/s - 3.03x slower ``` In this environment, `integer_at` is 3.03x faster than `[].to_i`. ruby/strscan@8a60879b2d Co-authored-by: jinroq <jinroq@gmail.com>

The earlier `rake vendor:compact_index` hook into `dev:deps` and the hard-copy step in ruby-core.yml fell apart in ruby/ruby's test-bundler runner, which sets TMPDIR per process and does not invoke our rake tasks. Pull the fetch logic into `Spec::Rubygems.install_vendored_compact_index` and call it from `install_test_deps` so every test setup path - local `bin/rspec`, `bin/parallel_rspec`, GHA bundler.yml, and ruby/ruby's test-bundler - lands the files at `Path.tmp_root.join("compact_index")` exactly where the artifice already looks. The standalone rake task and its workflow hop are no longer needed. ruby/rubygems@d8536e115e Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… is set Previously the install_vendored_compact_index short-circuit only checked whether `tmp/compact_index/lib/compact_index.rb` existed, so once any ref was vendored a subsequent `COMPACT_INDEX_REF=<sha> bin/rspec ...` kept serving the stale copy. Drop the vendor tree first when the env var is explicitly set so an override always re-fetches against the requested ref. ruby/rubygems@db5d06953f Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The vendored compact_index install ran without any coordination, so two test setups starting at once could both write into tmp/compact_index/ at the same time. The skip guard also checked a single file, which meant an interrupted download leaving only lib/compact_index.rb behind would be treated as a complete vendor tree on the next run. Take an exclusive file lock around the install and only skip the download once every expected file is present, removing the tree under the same lock when COMPACT_INDEX_REF forces a refresh. ruby/rubygems@0451700769 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rb_gc_modular_gc_loaded_p and rb_gc_active_gc_name are only used when compiling with modular GC enabled.

Prism accepted a modifier conditional as the predicate of an `if`/`unless` (and `elsif`), while parse.y rejects it: if a if b then end # parse.y: SyntaxError, prism: accepted unless a unless b then end The `if`/`unless`/`elsif` predicate was parsed with a floor of `PM_BINDING_POWER_MODIFIER`, which is inclusive of the modifier conditionals (`if`/`unless`/`while`/`until`, left binding power `PM_BINDING_POWER_MODIFIER`), so they were absorbed into the predicate. `while`/`until` predicates already use `PM_BINDING_POWER_COMPOSITION` and reject these correctly. Raise the floor to `PM_BINDING_POWER_MODIFIER + 1` so the predicate still absorbs `and`/`or` (and tighter operators) but excludes the modifier conditionals, matching parse.y and aligning `if`/`unless` with `while`/`until`. ruby/prism@0fa8d8f8d8

…dicate floor The `+ 1` form is meant to express associativity, not a binding-power floor; use the named constant instead, matching while/until conditions. ruby/prism@53d1ee76c6

`gc_prof_mark_timer_start` and `gc_prof_mark_timer_stop` include DTrace hooks for the `MARK_BEGIN` and `MARK_END` events, respectively. Previously, those probes are only triggered in `gc_marks`. However, `gc_marks_continue` and `gc_rest` also contain marking activities, but are not captured by the probes. We move the invocation of `gc_prof_mark_timer_start` and `gc_prof_mark_timer_stop` into `gc_marking_enter` and `gc_marking_exit` to ensure all marking activities are captured by the probes.

…uby#17158) This reverts commit ddb5055.

Fold arithmetic identity operations such as `x + 0`, `x - 0`, `x * 1`, `x / 1` in `fold_constants`.

When VM state is corrupted enough, we can call abort() from the SIGABRT handler. Previously, we would spam until the stack is full: ABRT received in SEGV handler SEGV received in ABRT handler ABRT received in SEGV handler ABRT received in ABRT handler ABRT received in ABRT handler ABRT received in ABRT handler [...] We've seen this on CI: https://github.com/ruby/ruby/actions/runs/26591192708/job/78350130653 To test this situation locally, temporarily patch in a call to abort() in rb_bug_for_fatal_signal() then use `Process.kill(:ABRT, Process.pid)`. Fix by restoring the default signal handler before aborting.

The completeness check skips the download once every expected file exists, but File.write truncates before writing, so an install interrupted partway leaves a half-written file in place. A later run would see it, treat the vendor tree as complete, and load a truncated source. Download each file to a sibling .tmp path and rename it into place. The rename stays within the vendor tree, so it is atomic and a target file is only ever observed fully written. ruby/rubygems@a2f265f579 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

It already implictly is on recent rubies because it has no mark function, but might as well make it explicit. ruby/json@8d7f975b01

This makes it much easier to debug. ruby/json@bbcf0a3254

…arser Add the `allow_comments: true` parsing option. ruby/json@138b9a2c49

ruby/json@8972c6da6e

Suppres it for now because JRuby version does not support it yet. ruby/json@8144d4cb34

…from the lockfile ruby/rubygems@66dd16f025

The legacy dependency API exposes no per-version publish dates, so the cooldown filter has nothing to compare against and silently does nothing. Document that limitation as a regression guard rather than a surprise. ruby/rubygems@ce952c1b9f Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A single converge can hold several rubygems sources, each keyed by its own remotes. A partial update re-converges the still-locked sources, the path that used to drop cooldown, so lock in that the cooldown stays attached to the source that declared it for both a top-level source and a gem-block source. ruby/rubygems@2fcb2d70cc Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

bundle add re-resolves the Gemfile against the existing lockfile via the injector, which converges sources the same way install and update do. A gem added there must inherit the source's cooldown instead of grabbing an in-window release. ruby/rubygems@cfed95d1dd Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

This is the exact path from the original report: regenerating the lockfile must not advance a gem into the cooldown window. Assert the written lockfile to guard the lock-only flow that install and update specs do not touch. ruby/rubygems@293d7c7dc3 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A frozen install reads the lockfile rather than resolving, so cooldown never runs. Document that a version locked inside the window still installs, so the bypass stays intentional. ruby/rubygems@bfc099bc26 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A mirror rewrites the fetch URI while cooldown stays keyed by the URI declared in the Gemfile. Confirm the redirect to the serving mirror does not lose the cooldown. ruby/rubygems@ea2bb164f1 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Stamping each solo_gem with its own Time.now.utc lets the two dates drift apart and matches neither the surrounding before block. Snapshot the time once so the cooldown window stays stable as thresholds tighten. ruby/rubygems@69c6d876d4 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Although a crash could occur anywhere, one of the most common symptoms we see from getting a reference to a garbage collected object is crashing while attempting to call a method on it. These crashes usually occur when trying to perform an rb_id_table_lookup in the "class" cc_tbl, where the class is usually another garbage collected object, because the freelist is stored in the class pointer. This commit aims to have a better error message in this case, in hopes of having a better grouping of errors and to give a better hint to those investigating and triaging crashes.

jhawthorn and others added 29 commits May 29, 2026 11:29

Atomic fast path for "locale" encoding registration

d901035

The "locale" encoding is only registered once, so we can use an atomic to avoid the full lock and hash lookup.

Revert "Reserve 2 bits for expressing object layout (ruby#17139)"

ddb5055

This reverts commit 63d9f09.

[DOC] Improve docs for Method#source_location

f77290c

Update the RBS test for ruby/rbs#2981

2750e9a

Reduce unnecessary intermediate strings

f78cd18

`"%" PRIsVALUE` just copies/appends the corresponding argument as a string.

Free already closed IO immediately

9c2ca0e

Suggested by @jhawthorn.

[DOC] Fix typos in String#dump and String#bytesplice docs

8a12816

Make TracePoint support compaction

5e7a7f9

[DOC] Move packed_data from rdoc to markdown

1589add

To use tables mainly.

Make hash_replace_ref to evaluate rb_gc_location once

7225c79

In hash_replace_ref, it currently evaluates rb_gc_location twice for the key and value. We can reduce that to once each.

Wrap functions in USE_MODULAR_GC

344b5ea

rb_gc_modular_gc_loaded_p and rb_gc_active_gc_name are only used when compiling with modular GC enabled.

Update to ruby/spec@6b3b96d

607df5d

[ruby/prism] Use PM_BINDING_POWER_COMPOSITION for the conditional pre…

c49e778

…dicate floor The `+ 1` form is meant to express associativity, not a binding-power floor; use the named constant instead, matching while/until conditions. ruby/prism@53d1ee76c6

Reapply "Reserve 2 bits for expressing object layout (ruby#17139)" (r…

c4d09f4

…uby#17158) This reverts commit ddb5055.

ZJIT: Fold arithmetic identity operations (ruby#17131)

9c62016

Fold arithmetic identity operations such as `x + 0`, `x - 0`, `x * 1`, `x / 1` in `fold_constants`.

eregon and others added 27 commits June 6, 2026 12:53

Update to ruby/spec@e695ce6

937a1d9

Extract str_to_cstr

b08d3f8

Constify local pointer variables in string.c

2d9827d

[DOC] Fnmatch examples doc

925fa46

sprintf.c: Refactor CHECK macro

99d0601

sprintf.c: Fix width overflow

e39475a

Skip the hang-up test in mmtk [ci skip]

2c72ee1

[ruby/json] Mark JSON_Parser_frame_stack_type as WB protected

c022ecc

It already implictly is on recent rubies because it has no mark function, but might as well make it explicit. ruby/json@8d7f975b01

[ruby/json] Compile UNREACHABLE_RETURN into rb_bug when in debug mode

c332b80

This makes it much easier to debug. ruby/json@bbcf0a3254

IO::Buffer: Validate the mask argument of bit operations

4bd3e14

[DOC] Fix missing parentheses in Kernel#print

be9725c

[ruby/json] Deprecate default support of JavaScript comments in the p…

fe5df19

…arser Add the `allow_comments: true` parsing option. ruby/json@138b9a2c49

[ruby/json] parser.c: Precompute JSON::ParserError and ivar IDs

ccca0be

ruby/json@8972c6da6e

[DOC] Improve docs for ObjectSpace.each_object

6be7b1f

[ruby/json] Suppres the warning for comment in JSON

86cf1e8

Suppres it for now because JRuby version does not support it yet. ruby/json@8144d4cb34

[ruby/rubygems] Preserve per-source cooldown when converging sources …

91023db

…from the lockfile ruby/rubygems@66dd16f025

IO::Buffer: Validate the buffer after argument conversion

2c8002d

IO::Buffer: Validate the buffer after type argument conversion

773e0c3

IO::Buffer: Avoid inadvertent ID creation

09e76c6

jhawthorn force-pushed the call_t_none_assertion branch from d46b726 to 9f5fc98 Compare June 8, 2026 17:36

jhawthorn closed this Jun 8, 2026

jhawthorn deleted the call_t_none_assertion branch June 8, 2026 18:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a check in non-debug mode for a T_NONE class#573

Add a check in non-debug mode for a T_NONE class#573
jhawthorn wants to merge 188 commits into
masterfrom
call_t_none_assertion

jhawthorn commented May 5, 2025

Uh oh!

samuel-williams-shopify commented May 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Conversation

jhawthorn commented May 5, 2025

Uh oh!

samuel-williams-shopify commented May 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants